background noise
Crossing the Species Divide: Transfer Learning from Speech to Animal Sounds
Cauzinille, Jules, Miron, Marius, Pietquin, Olivier, Hagiwara, Masato, Marxer, Ricard, Rey, Arnaud, Favre, Benoit
Self-supervised speech models have demonstrated impressive performance in speech processing, but their effectiveness on non-speech data remains underexplored. We study the transfer learning capabilities of such models on bioacoustic detection and classification tasks. We show that models such as HuBERT, WavLM, and XEUS can generate rich latent representations of animal sounds across taxa. We analyze the models' properties with linear probing on time-averaged representations. We then extend the approach to account for the effect of time-wise information with other downstream architectures. Finally, we study the implications of frequency range and noise on performance. Notably, our results are competitive with fine-tuned bioacoustic pre-trained models and show the impact of noise-robust pre-training setups. These findings highlight the potential of speech-based self-supervised learning as an efficient framework for advancing bioacoustic research.
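As a rough illustration of what "linear probing on time-averaged representations" means, the sketch below averages frame-level embeddings over time and fits a linear classifier on the resulting vectors. It is not the authors' code: the embeddings are synthetic stand-ins for frozen model outputs, and `fake_clip`, `mean_pool`, `train_linear_probe`, and all hyperparameters are hypothetical names and values chosen for the example.

```python
import math
import random


def mean_pool(frames):
    """Average a T x D list of frame embeddings over time into one D-vector."""
    n_frames = len(frames)
    dim = len(frames[0])
    return [sum(frame[d] for frame in frames) / n_frames for d in range(dim)]


def train_linear_probe(features, labels, epochs=200, lr=0.5):
    """Fit a binary logistic-regression probe with batch gradient descent.

    Only the probe's weights are trained; the features stay fixed,
    mirroring the idea of probing a frozen encoder.
    """
    dim = len(features[0])
    n = len(features)
    weights = [0.0] * dim
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(features, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            err = 1.0 / (1.0 + math.exp(-z)) - y
            for d in range(dim):
                grad_w[d] += err * x[d]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Return the predicted class (0 or 1) for one pooled feature vector."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


def fake_clip(label, rng, n_frames=20, dim=4):
    """Hypothetical stand-in for frozen encoder output on one audio clip."""
    centre = 1.0 if label == 1 else -1.0
    return [[centre + rng.gauss(0.0, 0.5) for _ in range(dim)]
            for _ in range(n_frames)]


rng = random.Random(0)
labels = [i % 2 for i in range(40)]
clips = [fake_clip(y, rng) for y in labels]
features = [mean_pool(clip) for clip in clips]   # one vector per clip
weights, bias = train_linear_probe(features, labels)
accuracy = sum(predict(weights, bias, x) == y
               for x, y in zip(features, labels)) / len(labels)
```

Because mean pooling discards the order of frames, a probe of this kind cannot use temporal structure, which is presumably why the abstract goes on to consider downstream architectures that account for time-wise information.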
RECTor: Robust and Efficient Correlation Attack on Tor
Wu, Binghui, Divakaran, Dinil Mon, Csikor, Levente, Gurusamy, Mohan
Tor is a widely used anonymity network that conceals user identities by routing traffic through encrypted relays, yet it remains vulnerable to traffic correlation attacks that deanonymize users by matching patterns in ingress and egress traffic. However, existing correlation methods suffer from two major limitations: limited robustness to noise and partial observations, and poor scalability due to computationally expensive pairwise matching. To address these challenges, we propose RECTor, a machine learning-based framework for traffic correlation under realistic conditions. RECTor employs attention-based Multiple Instance Learning (MIL) and GRU-based temporal encoding to extract robust flow representations, even when traffic data is incomplete or obfuscated. These embeddings are mapped into a shared space via a Siamese network and efficiently matched using approximate nearest neighbor (aNN) search. Empirical evaluations show that RECTor outperforms state-of-the-art baselines such as DeepCorr, DeepCOFFEA, and FlowTracker, achieving up to 60% higher true positive rates under high-noise conditions and reducing training and inference time by over 50%. Moreover, RECTor demonstrates strong scalability: inference cost grows near-linearly as the number of flows increases. These findings reveal critical vulnerabilities in Tor's anonymity model and highlight the need for advanced model-aware defenses.
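The attention-based Multiple Instance Learning step mentioned above can be pictured as a softmax-weighted average over per-window embeddings, so that uninformative or noisy windows contribute little to the final representation. The sketch below is a generic, simplified version of that pooling idea and makes no claim about RECTor's actual architecture: the linear scoring rule, the `attention_pool` function, and the toy `windows` data are all assumptions made for illustration.

```python
import math


def attention_pool(instances, score_vector):
    """Softmax-weighted average of instance embeddings.

    Each instance gets a scalar score (here a plain dot product with
    `score_vector`, a simplification of a learned scoring network);
    the scores are normalised with a softmax and used as weights.
    """
    scores = [sum(s * x for s, x in zip(score_vector, inst))
              for inst in instances]
    top = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(instances[0])
    pooled = [sum(w * inst[d] for w, inst in zip(weights, instances))
              for d in range(dim)]
    return pooled, weights


# Three toy window embeddings; the third plays the role of a noisy window.
windows = [[1.0, 0.0], [0.9, 0.1], [-5.0, 3.0]]
score_vector = [1.0, 0.0]                   # hypothetical, would be learned
pooled, weights = attention_pool(windows, score_vector)
```

In this toy case the outlying third window receives a weight close to zero, so the pooled vector stays near the two consistent windows. In a trained model the scoring parameters would be learned jointly with the rest of the network rather than fixed by hand.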